Abstract. Non-interference is a semantic program property that assigns
confidentiality levels to data objects and prevents illicit information flows
from occurring from high to low security levels. In this paper, we present a
novel security model for global non-interference which approximates
non-interference as a safety property. We also propose a certification technique
for global non-interference of complete Java classes based on rewriting logic, a
very general logical and semantic framework that is efficiently implemented in
the high-level programming language Maude. Starting from an existing Java
semantics specification written in Maude, we develop an extended,
information-flow Java semantics that allows us to correctly observe global
non-interference policies. In order to achieve a finite state transition system,
we develop an abstract Java semantics that we use for secure and effective
non-interference Java analysis. The analysis produces certificates that are
independently checkable and are small enough to be used in practice.

1 Introduction

Confidentiality is a property by which information that is related to an entity
or party is not made available or disclosed to unauthorized individuals,
entities, or processes. One way to protect confidential data is by establishing
an access control policy [12] that restricts the access to objects depending on
the identity or the role performed by the user, meaning that some privilege is
required to access confidential data. A user might establish an access control
policy by stipulating that no data that is visible to other users be affected by
confidential data. Such a policy allows programs to manipulate and modify
confidential data as long as the observable data generated by those programs do
not improperly reveal information about the confidential data. A security policy
of this sort is called a non-interference policy [18] because confidential data
should not interfere with publicly observable data. Thus, ensuring that a
program adheres to a non-interference policy means analyzing how information
flows within the program. The mechanism for transferring information through a
computing system is called a channel. Variable updating, parameter passing,
value return, file reading and writing, and network communication are channels.
Channels that use a mechanism that is not designed for information communication
are called covert channels [38]. There are covert channels such as the control
structure of a program, termination, timing, exceptions, and resource exhaustion
channels. The information flow that occurs through channels is called explicit
flow [TH] because it does not depend on the specific information that flows. The
information flow that occurs through the control structure of a program
(conditionals, loops, breaks, and exceptions) is called an implicit flow [18]
because it depends on the value of the condition that guards the control
structure. In this paper, we are interested in both explicit and implicit flows
for non-interference analysis of deterministic Java programs. However, we do not
consider covert channels such as termination, timing, exceptions, and resource
exhaustion channels, i.e., releasing information through termination or
non-termination of a computation, through the time at which an action occurs, or
by the exhaustion of a finite shared resource such as memory.

In [1,2], we proposed an abstract methodology for certifying safety properties
of Java source code. It is based on rewriting logic (RWL) and is implemented in
Maude [Mj, which is a high-performance language that implements RWL [32]. In
[1], we considered integer arithmetic properties that we analyzed as a safety
property, whereas in [5] we dealt with (local) non-interference of Java methods.
Non-interference is usually defined as a hyperproperty [13], i.e., a property
defined on a set of sets of traces, and cannot be established by simply checking
a (safety) property on a set of runs (essentially, no single run of a system can
violate non-interference). However, we are able to analyze non-interference by
observing a stronger property which can be checked as a safety property using an
instrumented, flow-sensitive semantics.

The methodology of [1,2] is as follows. Consider a Java program together with a
specification of the Java semantics. The Java program is a concrete expression
(i.e., term) that represents the initial state of the Java interpreter running
the considered Java program. The Java semantics is a specification in Maude.
Given a safety property (i.e., a system property that is defined in terms of
certain events that do not happen [30]), the unreachability of the system states
that denote the events that should never occur allows us to infer the desired
safety property. Unreachability analysis is performed by using the standard
Maude (breadth-first) search command, which explores the entire state space of
the program from an initial system state. In the case where the unreachability
test succeeds, the corresponding rewriting proofs that demonstrate that those
states cannot be reached are delivered as the expected outcome certificate. Very
often the unreachability test does not succeed because there is an infinite
search space; thus, we achieve a finite search space by using abstraction [13].
In our methodology, certificates are encoded as (abstract) rewriting sequences
that (together with an encoding of the abstraction in Maude) can be checked by
standard reduction. Our methodology is an instance of proof-carrying code (PCC),
a mechanism originated by Necula [36] for ensuring the secure behavior of
programs.
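
The unreachability test at the core of this methodology can be illustrated
outside Maude. The following Java sketch is only an analogy and not the
implementation used in this work: it explores a finite transition system
breadth-first and reports whether some state satisfying a "bad" predicate can be
reached. The names Reachability, findBadState, and the toy counter system are
hypothetical and chosen for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical illustration: breadth-first exploration of a finite state space,
// in the spirit of a breadth-first search command (not Maude's implementation).
final class Reachability {

    // Returns the first reachable state satisfying `bad`, or empty if no such
    // state is reachable from `initial`. Terminates only on finite state spaces.
    static <S> Optional<S> findBadState(S initial,
                                        Function<S, List<S>> successors,
                                        Predicate<S> bad) {
        Set<S> visited = new HashSet<>();
        Deque<S> frontier = new ArrayDeque<>();
        visited.add(initial);
        frontier.add(initial);
        while (!frontier.isEmpty()) {
            S state = frontier.poll();
            if (bad.test(state)) {
                return Optional.of(state);
            }
            for (S next : successors.apply(state)) {
                if (visited.add(next)) {   // enqueue each state at most once
                    frontier.add(next);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Toy system: a counter modulo 4 that steps by 2 starting from 0,
        // so only the states 0 and 2 are reachable.
        Function<Integer, List<Integer>> step = s -> List.of((s + 2) % 4);
        System.out.println(findBadState(0, step, s -> s == 3).isPresent()); // false
        System.out.println(findBadState(0, step, s -> s == 2).isPresent()); // true
    }
}
```

An empty result plays the role of a successful unreachability test; the search
terminates only because the toy state space is finite, which is the reason why
abstraction is needed for real programs.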

This article provides a comprehensive and full-fledged formulation of the
abstract non-interference certification methodology of [2]. In that work, we
focused on the methodology as well as the PCC and rewriting-based particulars of
our approach with a specific emphasis on practicality and good performance. This
paper, however, formalizes more foundational semantic security aspects, namely:
(i) the characterization of non-interference as a safety property on extended
Java computations; (ii) the conditions required by Java programs in order to
ensure the correctness of our methodology; (iii) the observational capabilities
of an attacker; and (iv) the soundness of our abstract non-interference analysis
technique. In our previous work [5], we analyzed (local) non-interference of
Java functional methods (i.e., methods that return values). In this paper,
however, we are able to analyze entire Java programs, and thus we consider
global non-interference.

This paper is organized as follows. In Section 2, we recall the notion of
non-interference and describe a mechanism to specify non-interference policies
in JML. In Section 3, we recall the specification of the Java semantics in
rewriting logic. In Section 4, we extend this semantics to handle confidential
information and formulate a non-interference certification methodology that is
based on the unreachability of undesired states in the extended semantics. In
other words, by using the extended, information-flow Java semantics, we are able
to correctly observe global non-interference policies by checking a stronger
safety property that, in our framework, implies non-interference. In Section 5,
we develop an approximation of the extended Java semantics that produces a
finite search space for any input Java program. By using this abstract semantics
(which we implement as a source-to-source transformation of the extended
semantics in Maude), we formulate our non-interference analysis and prove its
soundness. We include some experiments in Section 6. A thorough discussion of
related work is presented in Section 7. Finally, Section 8 presents our
conclusions.

2 Non-interference 

A non-interference policy establishes a confidentiality level for each source
program variable of primitive datatypes. It guarantees that actual values of
variables with a higher confidentiality level do not influence the output of a
variable with a lower confidentiality level during program execution
[18,26,38,9,43,19]. It is implicitly assumed that constants that appear in a
program always have the lowest confidentiality level (i.e., the considered
program is authorized to access secret data, but it does not contain secret data
in its code).

A non-interference policy can be represented by a partially ordered set
(Labels, <) and a labeling function Labeling : Var $\rightarrow$ Labels, where
Labels is the finite set of confidentiality levels, < is a partial order between
confidentiality levels, and Var is the set of source program variables
[42,5,27]. There are usually two confidentiality levels: Labels = {Low, High}.
These represent public, non-secret data (low confidentiality) and secret data
(high confidentiality), respectively. (Labels, <) forms a lattice where Low is
the greatest lower bound or bottom element ($\bot$), High is the least upper
bound or top element ($\top$), and Low < High. The join operator ($\sqcup$) is
defined as Low $\sqcup$ Low = Low; otherwise, X $\sqcup$ Y = High. Enforcing
non-interference means that the values of High-labeled source variables cannot
flow to Low-labeled source variables, whereas the values of Low-labeled source
variables can flow to High-labeled source variables. The attacker model for
global non-interference that we formalize below assumes that the attacker is
passive and can only see the Low-labeled source variables of the Java program at
the initial and final states and not at the intermediate states. Our methodology
can certify programs that have temporal breaches and are still non-interferent.
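
The two-point lattice and its join can be written down directly. The Java enum
below is a minimal sketch under the definitions above; the names Label, join,
and mayFlowTo are hypothetical and are not taken from the Maude specification.

```java
// Hypothetical two-point lattice Low < High with its join (least upper bound).
enum Label {
    LOW, HIGH;

    // Low join Low = Low; every other combination yields High.
    Label join(Label other) {
        return (this == LOW && other == LOW) ? LOW : HIGH;
    }

    // Information may flow from `this` to `target` only upwards in the lattice.
    boolean mayFlowTo(Label target) {
        return this == LOW || target == HIGH;
    }
}

final class LatticeDemo {
    public static void main(String[] args) {
        System.out.println(Label.LOW.join(Label.HIGH));      // HIGH
        System.out.println(Label.HIGH.mayFlowTo(Label.LOW)); // false
        System.out.println(Label.LOW.mayFlowTo(Label.HIGH)); // true
    }
}
```

The mayFlowTo check mirrors the enforcement rule stated above: High-labeled
values must not reach Low-labeled variables, while the opposite direction is
allowed.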

In order to express confidentiality policies, we use the Java Modeling Language
(JML) [2n], which is a property specification language for Java modules. As an
interface specification language, JML can describe the names and static
information found in Java declarations of Java modules with preconditions
(requires clauses), postconditions (ensures clauses), and assert statements
(assert clauses), all of which express first-order logic statements. As a
behavior specification language, JML can describe how the module behaves when
assertions are intermixed within the Java source code. The text of an annotation
can either be in one line after the //@ marker, or in many lines enclosed
between the markers /*@ and @*/. Annotations are ignored by traditional
compilers. The initial confidentiality level of a variable in a Java program is
written with the word setLabel as a JML annotation (e.g., setLabel(var, High)).
The confidentiality label of program variables is Low if nothing is specified
(i.e., program variables are public by default). We do not need to specify the
label of either the formal parameters or local variables because they can be
inferred from the confidentiality labels of other program variables if they are
properly initialized. These JML annotations, together with the default
assumption, define the labeling function of the non-interference policy.

Example 1. Consider the following Java program borrowed from [17] that models a
bank account and the initial state given by the execution of the function main:

  public class Account { int balance; //@ setLabel(balance, High);
    public boolean extraService;
    public Account() { balance = 0; extraService = false; }
    public void writeBalance(int amount) { balance = amount;
      if (balance >= 10000) extraService = true; else extraService = false; }
    private int readBalance() { return balance; }
    public boolean readExtra() { return extraService; }
  }

  class System { static Account a = new Account();
    public static void main(String[] args) {
      int initbalance; //@ setLabel(initbalance, High);
      initbalance = Integer.parseInt(args[0]);
      // java.lang.System is spelled out because this class is itself named System
      a.writeBalance(initbalance); java.lang.System.out.println(a.readExtra()); } }
This non-interference policy specifies that the object field balance of the
global object a and the initialization parameter initbalance (i.e., args[0])
hold secret data. This program is insecure w.r.t. this policy since an observer
with low access rights can obtain partial information about the variable balance
via an observation of the non-secret variable extraService.

We assume a fixed Java program $P_{Java}$. $\mathit{Vars}(P_{Java})$ denotes the
set of static source variables that may be initialized by the main function
call. We denote the set of Low program variables as
$\mathit{Low}(P_{Java}) = \{var \in \mathit{Vars}(P_{Java}) \mid \mathit{Labeling}(var) = \mathit{Low}\}$.
A program state $St$ is a set of value assignments to program variables. Given
$var \in \mathit{Vars}(P_{Java})$ and a state $St$, $St[var]$ denotes the value
of variable $var$ in $St$. We model a Java program $P_{Java}$ as a state
transition system between pairs $\langle P, St \rangle$, where $P$ is the
current, still-to-be-executed part of the Java program $P_{Java}$ and $St$
represents the current program state. $\langle P_{Java}, St_0 \rangle$ denotes
the initial configuration of standard program execution and
$\langle \checkmark, St \rangle$ denotes a final configuration, where
$\checkmark$ stands for the empty program. Note that we assume that every Java
program properly terminates for each set of input data (i.e., we do not consider
non-terminating programs, deadlocks, or runtime errors). We also assume
deterministic Java programs, without threads or exceptions. $\mapsto_{Java}$ is
the transition relation that describes any possible one-step transition between
any two Java program states. An execution (or trace) of $P_{Java}$ is a sequence

  $\langle P_{Java}, St_0 \rangle \mapsto_{Java} \cdots \langle P_i, St_i \rangle \mapsto_{Java} \cdots \mapsto_{Java} \langle \checkmark, St_n \rangle$,

which is simply denoted by
$\langle P_{Java}, St_0 \rangle \mapsto^{*}_{Java} \langle \checkmark, St_n \rangle$
if the intermediate states are irrelevant. We can also abbreviate
$\langle \checkmark, St_n \rangle$ by $\langle St_n \rangle$.

We define program non-interference by using an equivalence relationship
$=_{Low}$ between states [38,42,5,27]. Roughly speaking, non-interference
establishes that any two terminating runs of a program that start from
indistinguishable initial states produce indistinguishable final states.

Definition 1 (State equality [38]). Given a Java program $P_{Java}$, two states
$St_1$ and $St_2$ for $P_{Java}$ are indistinguishable at the confidentiality
level Low, written $St_1 =_{Low} St_2$, if for all
$var \in \mathit{Low}(P_{Java})$, $St_1[var] = St_2[var]$.

What the attacker can see from a final state is determined by a relation
$\approx_{Low}$. Two executions of a program $P_{Java}$ are related by
$\approx_{Low}$ if they are indistinguishable to the attacker [35]. The notion
of non-interference is therefore parametric on $\approx_{Low}$. A program is
non-interferent if, whenever different initial program states are
indistinguishable at level Low, this implies that the corresponding final states
are also indistinguishable at level Low.

Definition 2 (Non-interference [38]). A Java program $P_{Java}$ is
non-interferent if for every pair of different program initial states $St_1$ and
$St_2$, and for their corresponding final program states $St'_1$, $St'_2$ such
that $\langle P_{Java}, St_1 \rangle \mapsto^{*}_{Java} \langle St'_1 \rangle$
and $\langle P_{Java}, St_2 \rangle \mapsto^{*}_{Java} \langle St'_2 \rangle$,
we have that $St_1 =_{Low} St_2$ implies $St'_1 \approx_{Low} St'_2$.






In this paper, we follow the standard approach in the literature that considers
$St \approx_{Low} St'$ iff $St =_{Low} St'$. Then, the non-interference
condition of Definition 2 is understood as the lack of any strong dependence
[38] of Low-labeled variables on any of the High-labeled variables.
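
The two definitions can be exercised on the bank account of Example 1. The
following Java sketch is an illustration under simplifying assumptions, not part
of the certification methodology: states are modeled as maps from variable names
to integers, the Boolean field extraService is encoded as 0 or 1, and the names
LowEquivalence, lowEqual, pairRespectsPolicy, and account are hypothetical.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical sketch: states are maps from variable names to integer values.
final class LowEquivalence {

    // St1 =Low St2: both states agree on every Low-labeled variable.
    static boolean lowEqual(Map<String, Integer> st1, Map<String, Integer> st2,
                            Set<String> lowVars) {
        for (String v : lowVars) {
            if (!Objects.equals(st1.get(v), st2.get(v))) {
                return false;
            }
        }
        return true;
    }

    // Checks the implication of Definition 2 for ONE pair of initial states.
    // A single passing pair does not establish non-interference.
    static boolean pairRespectsPolicy(UnaryOperator<Map<String, Integer>> program,
                                      Map<String, Integer> st1,
                                      Map<String, Integer> st2,
                                      Set<String> lowVars) {
        if (!lowEqual(st1, st2, lowVars)) {
            return true; // premise does not hold, nothing to check
        }
        return lowEqual(program.apply(st1), program.apply(st2), lowVars);
    }

    // Toy model of Example 1: extraService (Low) is computed from balance (High).
    static Map<String, Integer> account(Map<String, Integer> st) {
        int balance = st.get("initbalance");
        return Map.of("initbalance", balance,
                      "balance", balance,
                      "extraService", balance >= 10000 ? 1 : 0);
    }

    public static void main(String[] args) {
        Set<String> low = Set.of("extraService");
        Map<String, Integer> s1 = Map.of("initbalance", 5000, "balance", 0, "extraService", 0);
        Map<String, Integer> s2 = Map.of("initbalance", 10000, "balance", 0, "extraService", 0);
        // Low-equal initial states, distinguishable final states: prints false.
        System.out.println(pairRespectsPolicy(LowEquivalence::account, s1, s2, low));
    }
}
```

One violating pair is enough to refute non-interference, whereas proving it
would require all pairs of Low-equal initial states, which is why it is a
hyperproperty rather than a property of single runs.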

3 The Rewriting Logic Semantics of Java 

In the following, we briefly recall the rewriting logic semantics of Java that
was originally given in [5T] and also used by the JavaFAN verification tool
[20,22]. We refer the reader to [33] for further technical details on rewriting
logic semantics.

A sufficiently large subset of the full Java 1.4 language is specified in Maude,
including inheritance, polymorphism, object references, multithreading, and
dynamic object allocation. However, Java native methods and many of the
available Java built-in libraries are not supported. The specification of the
Java operational semantics is a rewrite theory: a triple
$\mathcal{R}_{Java} = (\Sigma_{Java}, E_{Java}, R_{Java})$, where
$\Sigma_{Java}$ is an order-sorted signature;
$E_{Java} = \Delta_{Java} \uplus B_{Java}$ is a set of
$\Sigma_{Java}$-equational axioms, where $B_{Java}$ are algebraic axioms such as
associativity, commutativity and unity, and $\Delta_{Java}$ is a set of
terminating and confluent (modulo $B_{Java}$) equations. Finally, $R_{Java}$ is
a set of $\Sigma_{Java}$-rewrite rules that are not required to be confluent or
terminating.

Intuitively, the sorts and function symbols in $\Sigma_{Java}$ describe the
static structure of the Java program state space as an algebraic data type; the
equations in $\Delta_{Java}$ describe the operational semantics of its
deterministic features; and the rules in $R_{Java}$ describe its concurrent
features. Following the rewriting logic framework [41,32], we denote by
$u \rightarrow^{r}_{Java} v$ the fact that the concrete terms $u$, $v$ (which
denote Java program states) are rewritten (at the top position, see [21]) by
using $r$, which is either a rule in $R_{Java}$ or an equation in
$\Delta_{Java}$ (both of which are applied modulo $B_{Java}$). We simply write
$u \rightarrow_{Java} v$ when the applied rule or equation is irrelevant. We
denote by $\rightarrow^{*}_{Java}$ the extension of $\rightarrow_{Java}$ to
multiple rewrite steps (i.e., $u \rightarrow^{*}_{Java} v$ if there exist
$u_1, \ldots, u_k$ such that
$u \rightarrow_{Java} u_1 \rightarrow_{Java} u_2 \cdots u_k \rightarrow_{Java} v$).

The rewrite theory $\mathcal{R}_{Java}$ is defined on terms of a concrete sort
State, with the main state attributes (represented by means of constructor
symbols of the algebraic type State) such as fstack for handling function calls,
lstack for handling loops, env for assignments of variables to memory locations,
and store for assignments of memory locations to their actual values. They
define an algebraic structure that is parametric w.r.t. a generic sort Value
that defines all the possible values returned by Java functions or stored in the
memory. For instance, the int and bool constructor symbols describe Java integer
and boolean values and are defined in Maude as "op int : Int -> Value ." and
"op bool : Bool -> Value .", where Int and Bool are the internal built-in Maude
sorts that define integer and boolean data types. Intuitively, equations in
$\Delta_{Java}$ and rules in $R_{Java}$ are used to specify the changes to the
program state (i.e., the changes to the memory, input/output, etc.). Since we
consider only deterministic Java programs, our specification of the Java
semantics in rewriting logic contains only equations and no rules. The reader
can find a RWL specification of the semantics of a programming language with
threads in |33lll2j.

  eq k((E > E') -> K) = k((E, E') -> > -> K) .                --- Evaluate arguments
  eq k((int(I), int(I')) -> > -> K) = k(bool(I > I') -> K) .  --- Resolve

Fig. 1. Continuation-based equations for the Java greater-than operator on
integers

  --- First obtain location in store from variable name
  eq k(Var -> K) env([Var, Loc] Env) = k(#(Loc) -> K) env([Var, Loc] Env) .
  --- Then obtain value stored in this location
  eq k(#(Loc) -> K) store([Loc, Value] Store)
   = k(Value -> K) store([Loc, Value] Store) .

Fig. 2. Continuation-based equations for variable content retrieval

  --- Obtain variable location and evaluate expression
  eq k(Var = E -> K) env([Var, Loc] Env)
   = k(E -> =(Loc) -> K) env([Var, Loc] Env) .
  --- Once the expression is computed, assign to location
  eq k(Val -> =(Loc) -> K) = k([Val -> Loc] -> (Val -> K)) .
  --- General procedure to update the memory
  eq k([Val -> Loc] -> K) store([Loc, Val'] ST) = k(K) store([Loc, Val] ST) .

Fig. 3. Continuation-based equations for the Java assignment operator

The semantics of Java is defined in a continuation-based style [33] and
specified in Maude itself. Continuations maintain the control context, which
explicitly specifies the next steps to be performed. The sequence of actions
that still need to be executed is stacked. We use letters K, K' to denote
continuation variables, letters E, E' to denote expressions to be evaluated, and
Val, Val' to denote values (i.e., the result of evaluating an expression). Once
the expression e on the top of a continuation (e -> k) is evaluated, its result
will be passed on to the remaining continuation k. For instance, in Figure 1,
the Java greater-than operation on Java integers is specified by using
continuations, where k is the constructor symbol used to denote a continuation,
-> is the constructor symbol used to concatenate continuations, bool is the
constructor symbol used to denote a Java boolean value, and int is the
constructor symbol used to denote a Java integer number.

One important aspect of the semantics is the handling of Java variables. In
Figure 2, we show how the contents of a Java variable are retrieved from the
store (or memory) in the Java state. The semantics of the assignment operator
for the Java variables is specified in Figure 3. The if-then-else statement is
shown in Figure 4. The semantics of while statements (loops) is specified in
Figure 5, where the term while E S denotes the Java iteration statement, the
term while(E, S) denotes both the while continuation and the while statement
that is expressed in terms of the if(S, S') continuation, and lstack denotes a
stack of loops currently being executed, which is needed for a proper control of
the Java break statement. Figure 6 shows the semantic specification of the break
statement, which simply pops the stack of loops. This is important, since it can
also abruptly change the information flow. Method calls are not shown in this
paper; their semantics is simply defined by eager evaluation of all arguments of
the method (whose values are stored in new memory locations) and by creating a
new local environment that contains location assignments for formal method
parameters and local variables. Due to space limitations, we do not discuss heap
manipulation here. We refer the reader to [33] for further details.

  --- Evaluates boolean expression keeping the then and else statements
  eq k((if E S else S') -> K) = k(E -> (if(S, S') -> K)) .
  eq k(bool(true) -> (if(S, S') -> K)) = k(S -> K) .
  eq k(bool(false) -> (if(S, S') -> K)) = k(S' -> K) .

Fig. 4. Continuation-based equations for if-then-else statement

  --- Stack loop and transform while expression into while continuation
  eq k((while E S) -> K) lstack(Lstack)
   = k(while(E,S) -> popLStack -> K) lstack(while(E,S) -> K, Lstack) .
  --- A while continuation is transformed into an if-then-else
  eq k(while(E,S) -> K) = k(E -> if(S while(E,S), {}) -> K) .
  --- Add semantics for popLStack
  eq k(popLStack -> K) lstack(LItem, Lstack) = k(K) lstack(Lstack) .

Fig. 5. Continuation-based equations for while statement

  --- The state is restored from the loop stack
  eq k(break -> K) lstack(while(E,S) -> K', Lstack) = k(K') lstack(Lstack) .

Fig. 6. Continuation-based equations for while break statement

The following example illustrates the mechanization of the Java semantics. 

Example 2. Consider again the Java program of Example 1 and two program
executions, respectively fed with 5000 and 10000 for the initialization
parameter initbalance. Note that the corresponding initial states are
indistinguishable at the Low confidentiality level (e.g., the only Low-labeled
variable, extraService, is set to false in both of them). The Maude command
search provides built-in breadth-first search. We ask for the final Java program
state of each execution trace (actually, in order to visualize the results, we
show the output of println Java instructions). The Maude terms EX1-MAUDE and
EX2-MAUDE stand for the Java program with the corresponding initial call (for
input value 5000 and 10000, respectively), which are compiled into a Maude
expression by using a suitable Java wrapper*:

  search in PGM-SEMANTICS :
    java((preprocess(EX1-MAUDE) noType . 'main < new string [i(0)] > noVal))
    =>! JO:Output .

  Solution 1
  JO:Output --> pi(bool(false))
  No more solutions.

  search in PGM-SEMANTICS :
    java((preprocess(EX2-MAUDE) noType . 'main < new string [i(0)] > noVal))
    =>! JO:Output .

  Solution 1
  JO:Output --> pi(bool(true))
  No more solutions.

If the attacker observes these two final states, she will notice the two
different values for the variable extraService.

* Available at http://fsl.cs.uiuc.edu/index.php/Rewriting_Logic_Semantics_of_Java

4 Proving Non-interference by using an Extended 
Instrumented Semantics 

Non-interference is usually understood to be a security property and is
therefore defined as a hyperproperty [13] (i.e., a property defined on a set of
sets of traces). For instance, in Example 2, the verification process for
non-interference should check the (possibly infinite) set of (possibly infinite)
sets of final states issued from the (possibly infinite) sets of
indistinguishable initial configurations. Note that checking the final states
issued from EX1-MAUDE and EX2-MAUDE is just one of the combinations to be
analyzed. In contrast, the verification process for a safety property should
simply check the traces issuing from the (possibly infinite) set of initial
configurations, which is simpler.

In this paper, we prove non-interference as a safety property by instrumenting
the Java semantics in order to dynamically keep track of the change of the
confidentiality labels of program variables. Intuitively, the semantic
instrumentation is defined as follows:

1. Attach a confidentiality label to each memory location; this allows us to
   observe their confidentiality level at the final execution state.

2. Attach a confidentiality label to the evaluation of program expressions; this
   allows us to know whether the evaluation of an expression involves
   high-confidentiality data.

3. Associate a confidentiality label with the evaluation of program statements,
   particularly those involving conditional expressions or guards; this allows
   us to determine whether the control flow at a given execution point depends
   on the actual value of high-confidentiality variables. However, this label is
   not attached to each program statement. Rather, it is kept as an extra
   attribute of a state in the extended Java semantics. This corresponds to the
   notion of a context label being updated after each evaluation step in
   [18,28,27], which is introduced in the following example.

Example 3. Consider the following Java** program TestClass that is borrowed from
[55]. We endow it with the attached non-interference policy:

  public class TestClass { static int low=0, high; //@ setLabel(high, High);
    public static void main(String[] args) {
      high = Integer.parseInt(args[0]); while (high > 0) { high--; low++; } } }

** We omit the semantics of some Java operators such as _++, ++_, and _+=_,
since they can be defined in terms of addition (_+_) and assignment (_=_), as
usual [55].

Here there is an illicit and implicit information flow from the High-labeled
source variable high to the Low-labeled source variable low. For instance, when
the variable high contains the value 0 or 1, the variable low is assigned the
value 0 and 1, respectively. This implicit flow would be detected using the
context label, which is set to High after evaluating the expression high>0, and
which forces variable low to be set to High independently of the confidentiality
level of the expression low++.
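
The role of the context label in this example can be mimicked with a small
dynamic monitor. The Java sketch below is a simplification and not the extended
Maude semantics: it uses only the labels Low and High with a plain join, and the
names ContextLabelDemo and finalLabelOfLow are hypothetical.

```java
// Hypothetical dynamic monitor for the loop of Example 3: every variable
// carries a label, and a context label records control dependence on High data.
final class ContextLabelDemo {

    enum Label {
        LOW, HIGH;

        Label join(Label other) {
            return (this == LOW && other == LOW) ? LOW : HIGH;
        }
    }

    // Runs the loop on the given input and returns the final label of `low`.
    static Label finalLabelOfLow(int input) {
        int high = input;
        Label highLabel = Label.HIGH;   // high is declared High by the policy
        int low = 0;
        Label lowLabel = Label.LOW;     // low is Low by default
        Label context = Label.LOW;      // no control dependence yet

        while (true) {
            // The guard high > 0 reads high, so its label joins the context.
            boolean guard = high > 0;
            context = context.join(highLabel);
            if (!guard) {
                break;
            }
            high--;
            highLabel = highLabel.join(context);
            low++;                                // assignment under a High context
            lowLabel = lowLabel.join(context);
        }
        return lowLabel;
    }

    public static void main(String[] args) {
        System.out.println(finalLabelOfLow(2)); // HIGH
    }
}
```

In this simplified monitor, the label of low is raised only on runs that
actually execute the loop body; a single concrete run therefore says nothing
about other inputs.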

In contrast to [2], where local non-interference was studied, here we consider
global non-interference (i.e., we are able to ensure a non-interference policy
at the final state of the whole Java program execution, which contains several
methods, classes, and function calls). This important improvement in the
verification power (which has hardly been explored in the related literature)
requires the following two modifications to the non-interference analysis of
[2]. These changes avoid the difficult (or costly) process of tracing the
current confidentiality label of a memory location back to the point where this
location was created.

1. We introduce an additional confidentiality label (Low $\triangleright$ High),
   which allows us to represent not only the current confidentiality label of a
   memory location but also to keep track, at a global level, of hazardous
   transitions from an initial confidentiality label Low to High. Similarly, we
   introduce the confidentiality label (High $\triangleright$ Low) in order to
   avoid false positives where a High-labeled variable is updated with the value
   of a Low-labeled expression and then updated again with the value of a
   High-labeled expression.

2. In [2], we used the context label only when updating the value of a variable
   in memory, as in [28,43,27,24], and when returning values, as in [23]. In
   this paper, we use the context label during expression evaluation, as in [S].

We describe the information-flow extended version of the rewriting logic
semantics of Java by the rewrite theory
$\mathcal{R}_{JavaE} = (\Sigma_{JavaE}, E_{JavaE}, R_{JavaE})$,
$E_{JavaE} = \Delta_{JavaE} \uplus B_{JavaE}$, and its corresponding
$\rightarrow_{JavaE}$ rewriting relation. In the new semantics, program data not
only consist of standard concrete values but each value is decorated with its
corresponding confidentiality label. Formally, we consider the label change
$\mathit{LabelChange} = \{Low \triangleright High, High \triangleright Low\}$ so
that the domain of program variables in the extended semantics is
$\mathit{Value} \times (\mathit{Labels} \cup \mathit{LabelChange})$. We write
<Value,LValue> for a pair consisting of a concrete value and its corresponding
confidentiality label in $\mathit{Labels} \cup \mathit{LabelChange}$.

Thanks to the modularity of the rewriting logic approach to formalizing program
semantics [H], our changes to the semantics of Section 3 are incremental and
minimal. As Figures 7 and 8 show, the evaluation of constants and variables now
uses the context label. As Figure 9 shows, the assignment computes the new
confidentiality label in terms of the previous label at the memory location,
namely NewVal = LVal' >>> LVal. The new operator >>> is defined in Figure 10.
  eq k(i(I) -> K) lenv(CL) = k(<int(I),CL> -> K) lenv(CL) .
  eq k(b(B) -> K) lenv(CL) = k(<bool(B),CL> -> K) lenv(CL) .

Fig. 7. Extended equations for extended constant evaluation

  --- First obtain location in store from variable name
  eq k(Var -> K) env([Var, Loc] Env) = ... .
  --- Then obtain value stored in this location
  eq k(#(Loc) -> K) store([Loc, <Val,LVal>] Store) lenv(CL)
   = k(<Val,LVal join CL> -> K) store([Loc, <Val,LVal>] Store) lenv(CL) .

Fig. 8. Extended equations for variable content retrieval

  --- Obtain variable location and evaluate expression
  eq k(Var = E -> K) env([Var, Loc] Env) = ... .
  --- Once the expression is computed, assign to location
  eq k(<Val,LVal> -> =(L) -> K)
   = k([<Val,LVal> -> L] -> (<Val,LVal> -> K)) .
  --- General procedure to update the memory
  eq k([<Val,LVal> -> Loc] -> K) store([Loc, <Val',LVal'>] ST)
   = k(K) store([Loc, <Val, LVal' >>> LVal>] ST) .

Fig. 9. Extended equations for the Java assignment operator



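  Previously Stored Label       >>>  New Label  =  New Stored Label
  -------------------------------------------------------------------
  L                             >>>  L          =  L
  Low                           >>>  High       =  Low $\triangleright$ High
  High                          >>>  Low        =  High $\triangleright$ Low
  L1 $\triangleright$ L2        >>>  L1         =  L1
  L1 $\triangleright$ L2        >>>  L2         =  L1 $\triangleright$ L2

Fig. 10. Updating memory locations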
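
The update operator of Fig. 10 is small enough to be written as a lookup. The
Java enum below is a sketch of our reading of that table, not the Maude
definition: the two label-change values are modeled as the hypothetical
constants LOW_TO_HIGH and HIGH_TO_LOW, and the newly assigned label is assumed
to be a plain Low or High.

```java
// Hypothetical encoding of the stored labels, including the two label changes.
enum StoredLabel {
    LOW, HIGH, LOW_TO_HIGH, HIGH_TO_LOW;

    // Combines the previously stored label (this) with the label of the newly
    // assigned value, which is assumed to be plain LOW or HIGH.
    StoredLabel update(StoredLabel fresh) {
        if (fresh != LOW && fresh != HIGH) {
            throw new IllegalArgumentException("new label must be LOW or HIGH");
        }
        switch (this) {
            case LOW:          // initial label Low
            case LOW_TO_HIGH:  // going back to Low clears the recorded change
                return fresh == LOW ? LOW : LOW_TO_HIGH;
            case HIGH:         // initial label High
            case HIGH_TO_LOW:  // going back to High clears the recorded change
                return fresh == HIGH ? HIGH : HIGH_TO_LOW;
            default:
                throw new AssertionError("unreachable");
        }
    }
}

final class UpdateDemo {
    public static void main(String[] args) {
        // A Low location receives a High value and is later overwritten
        // with a Low value: the temporary breach disappears.
        StoredLabel l = StoredLabel.LOW.update(StoredLabel.HIGH);
        System.out.println(l);                         // LOW_TO_HIGH
        System.out.println(l.update(StoredLabel.LOW)); // LOW
    }
}
```

Because a label change remembers the initial label of the location, the final
state alone reveals a hazardous Low-to-High transition without tracing the
history of the location.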



The context label can only change due to a conditional control flow state- 
ments. According to [1815128127] . the evaluation of its boolean guards returns 
a confidentiality level that is associated to the resulting true or false value 
and, possibly, a modified context label. The extended semantic equations for 
the if-then-else of Figure 0] need some slight revision, which is motivated by the 
following example. 

Example 4. Consider the following Java method, where the value computed for
the variable low does not actually depend on the value of the high confidentiality
variable high (which only affects the temporary variable aux). This program does
fulfill the non-interference policy at the final state, which can be proved by using
our non-interference verification methodology.

class Testclass { static int low=0, high; //@ setLabel(high, High);
public static void main(String[] args) {
high = Integer.parseInt(args[0]);
int aux=0; if (high > 2) aux = 1; else aux = 0; low = 0; } }

Evaluates boolean expression keeping the then and else statements

ceq k((if E S else S') -> K) lenv(CL)
 = k(E -> (if(S, S') -> restoreLEnv(CL) -> K)) lenv(CL)
 if not break-or-continue(S) and not break-or-continue(S') .
ceq k((if E S else S') -> K) lenv(CL) = k(E -> (if(S, S') -> K)) lenv(CL)
 if break-or-continue(S) or break-or-continue(S') .
eq k(<bool(true),LVal> -> (if(S, S') -> K)) lenv(CL)
 = k(S -> K) lenv(CL join LVal) .
eq k(<bool(false),LVal> -> (if(S, S') -> K)) lenv(CL)
 = k(S' -> K) lenv(CL join LVal) .

New equation to restore previous context label

eq k(restoreLEnv(CL) -> K) lenv(CL') = k(K) lenv(CL) .

Fig. 11. Extended equations for the if-then-else 

In order to avoid false positives during the evaluation of conditional
statements, we dynamically restore the previous context label after their execution.
The extended semantics equations for the if-then-else are shown in Figure 11,
where a new continuation symbol restoreLEnv is used to restore the previous
confidentiality label. However, restoring the previous context label has to be 
carefully considered in the presence of break or continue statements within 
a loop, since they can abruptly change the information flow as shown in the 
following example. 

Example 5. Consider a variation of Example 3 where the while loop has a bogus
guard together with a break statement to exit the loop:

public class Testclass { static int low=0, high; //@ setLabel(high, High);
public static void main(String[] args) { high = Integer.parseInt(args[0]);
int aux=0; while (true) { high--; low++; if (high == 0) break; } } }

As in Example 3, when the while loop ends, the variable low has the initial value
of the variable high. Whenever high ≠ 0, the break statement is not executed.
In this case, the conditional guard uses High-labeled data, and the conditional 
statement should not restore the previous context label. In other words, the 
critical component here is not the break statement but rather the else branch 
that does not contain the break. 

In order to solve this problem, we check in Figure 11 whether either of the
two branches of a conditional statement contains a break or continue statement
and no other conditional statement or while loop in between. If there is such a
statement, restoreLEnv is not used. This case was not considered in [43] or in

which only considered break statements within High guarded while loops. 
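
The effect of restoring (or not restoring) the context label can be sketched in a few lines of Java. This is a simplified illustration under stated assumptions, not the Maude equations of Figure 11: the class and method names are hypothetical, only the two base labels are modelled, and the break-or-continue check is passed in as a boolean rather than computed from the statement.

```java
public final class ContextLabel {
    enum Label { LOW, HIGH }

    /** Least upper bound of two base labels. */
    static Label join(Label a, Label b) {
        return (a == Label.HIGH || b == Label.HIGH) ? Label.HIGH : Label.LOW;
    }

    /** Context label in force while a branch of the conditional is executed. */
    static Label insideBranch(Label context, Label guard) {
        return join(context, guard);
    }

    /**
     * Context label in force after the conditional: the saved label is restored
     * unless one of the branches contains a break or continue statement.
     */
    static Label afterIf(Label context, Label guard, boolean branchHasBreakOrContinue) {
        return branchHasBreakOrContinue ? join(context, guard) : context;
    }

    public static void main(String[] args) {
        // Example 4: High guard, no break, so "low = 0" is evaluated in a Low context.
        Label ctx4 = afterIf(Label.LOW, Label.HIGH, false);
        System.out.println("Example 4, label written to low: " + join(Label.LOW, ctx4));

        // Example 5: High guard with a break, so the next "low++" sees a High context.
        Label ctx5 = afterIf(Label.LOW, Label.HIGH, true);
        System.out.println("Example 5, label written to low: " + join(Label.LOW, ctx5));
    }
}
```

The first call models why Example 4 is accepted, and the second models why the assignment to low in Example 5 is flagged even on iterations where the break is not taken.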

Method invocation propagates the context label without changes as proposed
in [28] and, thus, is not shown here. Since while statements were expressed in
terms of if-then-else statements, they need a slight extension to introduce the
restoreLEnv continuation (shown in Figure 12). The semantic specification of
the break statement stays the same as shown in Figure 5; the context label

Stack loop and transform while expression into while continuation

eq k((while E S) -> K) lstack(Lstack) lenv(CL)
 = k(while(E,S) -> restoreLEnv(CL) -> popLStack -> K)
 lstack(while(E,S) -> K, Lstack) lenv(CL) .

Fig. 12. Extended equations for while statement 

lenv(CL) is not modified and the restoreLEnv expression introduced by the 
while statement is removed. 

4.1 Proving non-interference as a safety property

Now, we are ready to formulate a novel characterization of non-interference that 
allows us to check it as a property that is verified for each possible execution 
trace instead of being verified for each set of indistinguishable execution traces. 

Definition 3 (Strong Non-interference). A Java program $P_{Java}$ is strongly
non-interferent for a given labeling function if for every extended initial state
$St^{E}$ and for its corresponding final program state $St'^{E}$ given by
$(P_{Java}, St^{E}) \rightarrow^{*}_{JavaE} (St'^{E})$, we have that for all
$var \in Low(P_{Java})$, $St'^{E}[var] = \langle Val, Low \rangle$ for a value Val.

Since in our model, a public variable can only have the label Low or the label
Low ≫ High, this means that in the extended execution of a program that
is not strongly non-interferent, the label of at least one public variable is
Low ≫ High. Given an initial state St and a given labeling function, we denote
the corresponding extended state by $St^{E}$.
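
For a single final state, the condition of Definition 3 amounts to a scan over the labels of the public variables. The Java sketch below is illustrative only: it assumes that the final store has already been projected to a map from variable names to labels, and the class name, method name and enum constants are hypothetical. Definition 3 additionally quantifies over every extended initial state, which this check does not attempt.

```java
import java.util.Map;
import java.util.Set;

public final class StrongNonInterference {
    enum Label { LOW, HIGH, LOW_TO_HIGH, HIGH_TO_LOW }

    /** True if every public (Low) variable carries the label LOW in the given final state. */
    static boolean holdsInFinalState(Map<String, Label> finalLabels, Set<String> lowVariables) {
        for (String var : lowVariables) {
            if (finalLabels.get(var) != Label.LOW) {
                return false; // e.g. LOW_TO_HIGH signals a possible illicit flow
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> lowVars = Set.of("low");
        // Final labels one would expect for Example 4 and Example 7, respectively.
        Map<String, Label> example4 = Map.of("low", Label.LOW, "high", Label.HIGH);
        Map<String, Label> example7 = Map.of("low", Label.LOW_TO_HIGH, "high", Label.HIGH);
        System.out.println(holdsInFinalState(example4, lowVars));
        System.out.println(holdsInFinalState(example7, lowVars));
    }
}
```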

Lemma 1. Consider a Java program $P_{Java}$ and two initial states $St_1$ and $St_2$
such that $St_1 \sim_{Low} St_2$. Consider the two corresponding final program states
$St'_1$ and $St'_2$ given by $(P_{Java}, St_1) \rightarrow^{*}_{Java} (St'_1)$ and
$(P_{Java}, St_2) \rightarrow^{*}_{Java} (St'_2)$. If there exists $var \in Low(P_{Java})$
such that $St'_1[var] \neq St'_2[var]$, then
$(P_{Java}, St_1^{E}) \rightarrow^{*}_{JavaE} (St'^{E}_1)$ and
$St'^{E}_1[var] = \langle Val, Low \gg High \rangle$ for a value Val.

Proof. Consider the two traces $\mathcal{D}_1 : (P_{Java}, St_1) \rightarrow^{*}_{Java} (St'_1)$
and $\mathcal{D}_2 : (P_{Java}, St_2) \rightarrow^{*}_{Java} (St'_2)$. Let
$\{var_1, \ldots, var_k\} \subseteq Low(P_{Java})$ be those variables such that
$St'_1[var_i] \neq St'_2[var_i]$ for all $1 \leq i \leq k$. Since we assume
$k > 0$, there is at least one of those variables (say $var_1$) and an assignment
statement $var_1 = E_1$ that is executed at least once in one of the two
traces (say $\mathcal{D}_1$). Let n be the total number of assignments in
$\mathcal{D}_1$ to variables $\{var_1, \ldots, var_k\}$. Note that n is finite since
execution traces are finite because of the termination assumption. Now, we prove
the result by induction on n.

1. (n = 1) Let us consider the last execution step in $\mathcal{D}_1$ where the assignment
$var_1 = E_1$ is executed. Then, it may happen that the assignment $var_1 = E_1$
is also executed in $\mathcal{D}_2$, or not. We consider these two cases separately.

(a) If $var_1 = E_1$ is also executed in $\mathcal{D}_2$, then $St'_1[var_1] \neq St'_2[var_1]$
implies that the values for $E_1$ are different in the two traces. Thus, expression
$E_1$ must contain at least one variable $var'$ such that the actual values
of $var'$ are different in the two traces when the considered assignments
to $var_1$ are executed. Since $St'_1[var'] \neq St'_2[var']$ and n = 1, then
$var' \notin Low(P_{Java})$. Therefore $var'$ is a High confidentiality variable, hence it
has a High label in our extended semantics. This means that the label
Low ≫ High is assigned to variable $var_1$ (according to Figure 10) in $\mathcal{D}_1$,
and the conclusion follows.

(b) If $var_1 = E_1$ is not executed in $\mathcal{D}_2$, then $St'_1[var_1] \neq St'_2[var_1]$
implies that the execution of this last assignment statement $var_1 = E_1$ in
$\mathcal{D}_1$ is conditioned on the result of a boolean expression containing High
confidentiality variables that guards a conditional (or while loop) statement
so that the assignment is executed in $\mathcal{D}_1$ and not in $\mathcal{D}_2$. Then, the
assignment statement $var_1 = E_1$ in $\mathcal{D}_1$ was executed either (i) within the then
or else branch of an if-then-else Java statement (recall that while loops
are expressed as if-then-else statements), (ii) within the then branch of
an if-then Java statement, or (iii) after evaluating a conditional expression
within a while loop that includes a break expression. Note that no
other case can generate an interference condition. In all three cases, our
extended semantics assigns a High label to the boolean guard expression
of such a conditional expression and the context label is set to High
(according to Figures 11 and 12) before the expression $E_1$ is evaluated
in the statement $var_1 = E_1$. Note that in case (iii), the conditional
expression propagates the High context label outside itself (according to
Figure 11), i.e. the conditional does not restore the previous context label
precisely to record that even if sequence $\mathcal{D}_1$ does not execute the break
statement, another possible trace (e.g. $\mathcal{D}_2$) can do it. Finally, in all three
cases, the expression $E_1$ is evaluated within a High-labeled context and
then the label Low ≫ High is assigned to variable $var_1$, independently
of whether expression $E_1$ manipulates High confidential data or not.

2. (n > 1) Let us consider the last execution step in $\mathcal{D}_1$ where the assignment
$var_i = E_i$ is executed, with $1 \leq i \leq k$. We split into two cases.

(a) If $var_i = E_i$ is also the last assignment to variables $\{var_1, \ldots, var_k\}$
executed in $\mathcal{D}_2$, then $St'_1[var_i] \neq St'_2[var_i]$ implies that the values for $E_i$
are different in the two traces. Thus, expression $E_i$ must contain at least
one variable $var'$ such that the actual values of $var'$ are different in the
two traces when the considered assignments to $var_i$ are executed. Then,
let us consider whether $var' \in \{var_1, \ldots, var_k\}$ or not. If it is, then by
induction hypothesis, we can assume that variable $var'$ has a Low ≫ High
label since we can replace the execution of the last assignment $var_i = E_i$
by a simple $var_i = cte$ (where cte is a constant) and the program will
still be interferent, due to the fact that the assignment to $var'$ occurs
before and could not be affected by the last assignment to $var_i$. If
$var' \notin \{var_1, \ldots, var_k\}$, then $var'$ is a High confidentiality variable and
it has a High label in our extended semantics. In both cases, the label
Low ≫ High is assigned to variable $var_i$ (according to Figure 10), and
the conclusion follows.

(b) If $var_i = E_i$ is not the last assignment to variables $\{var_1, \ldots, var_k\}$
executed in $\mathcal{D}_2$, then either there is no such assignment in $\mathcal{D}_2$ to
variables $\{var_1, \ldots, var_k\}$, or the last assignment in $\mathcal{D}_2$ has the form
$var_i = E'$, with $E'$ different from $E_i$, or it affects a variable $var''$ that
is different from $var_i$. All three cases imply that the execution of the
last assignment statement $var_i = E_i$ in $\mathcal{D}_1$ is conditioned on the result
of a boolean expression containing High confidentiality variables that
guards a conditional (or while loop) statement so that such assignment
is executed in $\mathcal{D}_1$ and not in $\mathcal{D}_2$. Then this case is analogous to
case (1)(b) above, and the result follows. □

From Lemma 1 we derive that strong non-interference implies non-interference,
as given by the following result.

Theorem 1 (Strong Non-interference Soundness). Given a Java program
$P_{Java}$, if $P_{Java}$ is strongly non-interferent (Definition 3), then $P_{Java}$ is
non-interferent (Definition 2).

Proof. (By contradiction) Assume that program $P_{Java}$ is strongly non-interferent
and also that $P_{Java}$ is interferent. Since $P_{Java}$ is strongly non-interferent, for every
extended initial state $St^{E}$ and for its corresponding final program state $St'^{E}$
given by $(P_{Java}, St^{E}) \rightarrow^{*}_{JavaE} (St'^{E})$, we have that for all
$var \in Low(P_{Java})$, $St'^{E}[var] = \langle Val, Low \rangle$ for a value Val. By Lemma 1
and the assumption that $P_{Java}$ is interferent, there is some $var \in Low(P_{Java})$ such
that $St'^{E}[var] = \langle Val, Low \gg High \rangle$ for a value Val,
hence $P_{Java}$ is not strongly non-interferent, contradicting the hypothesis. □

The following example illustrates the mechanization of our verification method- 
ology. 

Example 6. Consider again the Java program of Example 1. Now, we compute
the final state in the extended Java program execution for EX1-MAUDE (for
simplicity we show only the value of variable extraBalance).

search in PGM-SEMANTICS-EXTENDED :
java((preprocess(EX1-MAUDE) noType . 'main < new string [i(0)] > noVal))
=>! M:Store .

Solution 1 M:Store --> store([l(6), <bool(false), Low >> High>] ...)
No more solutions.

The execution for EX2-MAUDE will also contain the label Low ≫ High for variable
extraBalance.

In other words, we transform non-interference into a stronger property which
can be effectively checked in the extended semantics. Obviously, we are not able
to certify the security of all the programs that are secure, as shown in Example 7.

Example 7. Consider the following Java program borrowed from [43].

class Testclass { static int low=0, high; //@ setLabel(high, High);
public static void main(String[] args) { high = Integer.parseInt(args[0]);
low = high; low = low - high; } }

Apparently, there is an explicit flow from variable high to variable low through
the two assignment statements. However, for any execution, when the program ends,
the value of variable low is always 0, so that the variable low does not depend
on the variable high. According to Definition 2, the program is non-interferent.
However, we give a false positive by using our notion of strong non-interference
since the assignment "low = high" assigns to the variable low the confidentiality
label Low ≫ High and the last statement "low = low - high" does not
revert the label back to Low.

The program of Example [7] cannot be verified by traditional type inference 
approaches [42,46,4] either, since they fail to verify (type check) any program
with temporary breaches, e.g. Examples S] and [7] above, whereas Example H] is 
effectively verified by using our methodology. 

5 Approximating Non-interference by using an Abstract 
Semantics 

The extended, instrumented Java semantics defined so far allows us to develop 
a technique for proving non-interference. However, this technique is still not 
feasible in general because there are too many possible initial states to consider 
for the safety property to be checked. In the following, we develop an abstract, 
rewriting logic Java semantics that allows us to statically analyze global non- 
interference. Similar to the purpose of the abstract semantics is to correctly 
approximate the extended computations in a finite way. Given the extended 
Java semantics, where there are concrete labeled values, we simply get rid of 
the values in the abstract semantics, and use their confidentiality labels as the 
abstract values instead. 

In the following, we develop an abstract version of the extended rewriting
logic semantics of Java developed in Section 4, which we describe by the rewrite
theory $\mathcal{R}_{Java^\#} = (\Sigma_{Java^\#}, E_{Java^\#}, R_{Java^\#})$,
$E_{Java^\#} = \Delta_{Java^\#} \uplus B_{Java^\#}$, and its
corresponding $\rightarrow_{Java^\#}$ rewriting relation. As in Section 4, our approach for the
abstract Java semantics consists of modifying the original theory $\mathcal{R}_{JavaE}$ (taking
advantage of its modularity) by abstracting the domain to Labels ∪ LabelChange
and introducing approximate versions of the Java constructions and operators
tailored to this domain.

An abstract interpretation (or abstraction) [16] of the program semantics is
given by an upper closure operator $\alpha : \wp(State) \rightarrow \wp(State)$, which is
monotonic (for all $SSt_1, SSt_2 \in \wp(State)$, $SSt_1 \subseteq SSt_2$ implies
$\alpha(SSt_1) \subseteq \alpha(SSt_2)$), idempotent (for all $SSt \in \wp(State)$,
$\alpha(SSt) = \alpha(\alpha(SSt))$), and extensive (for all $SSt \in \wp(State)$,
$SSt \subseteq \alpha(SSt)$). In our framework, each Java program
state $St \in State$ is abstracted by its closure $\alpha(\{St\})$. Our abstraction function

rl k(LVal -> (if(S,S') -> K)) lenv(CL) => k(S -> K) lenv(CL join LVal) .
rl k(LVal -> (if(S,S') -> K) ) lenv(CL) => k(S' -> K) lenv(CL join LVal) . 

Fig. 13. Abstract rules for the if-then-else 

$\alpha : \wp(State^{E}) \rightarrow \wp(State^{E})$ is a simple homomorphic extension to sets of states
of the function $2nd : Value \times (Labels \cup LabelChange) \rightarrow (Labels \cup LabelChange)$,
meaning that we disregard the actual values of data.
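
The idea of keeping only the second component of each stored pair can be illustrated with a short Java sketch. It is a simplification under stated assumptions: a store is modelled as a map from integer locations to labeled values, and the names StoreAbstraction, Labeled and abstractStore are hypothetical; the actual abstraction in the paper operates on sets of Maude states.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public final class StoreAbstraction {
    enum Label { LOW, HIGH, LOW_TO_HIGH, HIGH_TO_LOW }

    /** A concrete labeled value <Val, LVal>. */
    static final class Labeled {
        final Object value;
        final Label label;
        Labeled(Object value, Label label) {
            this.value = value;
            this.label = label;
        }
    }

    /** Keeps only the label of every stored pair, discarding the concrete value. */
    static Map<Integer, Label> abstractStore(Map<Integer, Labeled> store) {
        Map<Integer, Label> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Labeled> entry : store.entrySet()) {
            result.put(entry.getKey(), entry.getValue().label);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two concrete stores that differ only in the value held at location 6.
        Map<Integer, Labeled> first = new LinkedHashMap<>();
        first.put(6, new Labeled(false, Label.LOW_TO_HIGH));
        Map<Integer, Labeled> second = new LinkedHashMap<>();
        second.put(6, new Labeled(true, Label.LOW_TO_HIGH));
        // Both collapse to the same abstract store.
        System.out.println(abstractStore(first).equals(abstractStore(second)));
    }
}
```

Because stores that differ only in their values are identified, a single abstract initial state can stand for all extended initial states that share the same labeling.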

In the abstract Java semantics, several alternative computation steps of
$\rightarrow_{JavaE}$ are mimicked by a single abstract computation step of $\rightarrow_{Java^\#}$,
reflecting the fact that several distinct behaviors are compressed into a single abstract
state (i.e. set of states). Instrumenting the Java semantics to deal
with a set of states instead of one single state would require too many
modifications. Therefore, we adopt a different approach. When several $\rightarrow_{JavaE}$
rewrite steps are mimicked by a single abstract rewriting step leading to an
abstract Java state, and those rewrite steps apply different rules or equations,
we use concurrency at the Maude level. Despite the fact that our extended Java
semantics contains only equations and no rules, the abstract Java semantics does
contain rules in $R_{Java^\#}$ to reflect the different possible evolutions of the system.

The abstract semantics is mainly a straightforward extension of the extended 
semantics. The only difference is that any set of equations that was confluent 
and terminating in the extended semantics but might become non confluent 
or non terminating in the abstract semantics is transformed into rules. As a 
representative example, the abstract rules associated to two of the equations of 
the extended semantics of the if-then-else statement are shown in Figure 13.

Now, we are ready to formalize the abstract rewriting relation $\rightarrow_{Java^\#}$, which
intuitively develops the idea of applying only one rule or equation from the
concrete Java semantics to an abstract Java state while exploring the different
alternatives in a non-deterministic way. By abuse, we denote the abstraction of a
rule $\alpha(\{l\}) \rightarrow \alpha(\{r\})$ by $\alpha(\{l\} \rightarrow \{r\})$. $\mathcal{P}_{Java}$ denotes
the sort of Java programs $P_{Java}$ (i.e. $P_{Java} \in \mathcal{P}_{Java}$).

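Definition 4 (Abstract rewriting). We define abstract rewriting
$\rightarrow_{Java^\#} \subseteq (\mathcal{P}_{Java} \times \wp(State^{E})) \times (\mathcal{P}_{Java} \times \wp(State^{E}))$
by $(P_{Java_1}, SSt_1) \rightarrow_{Java^\#} (P_{Java_2}, SSt_2)$
if $\exists u \in SSt_1, \exists v \in SSt_2$ s.t. $(P_{Java_1}, u) \rightarrow_{JavaE} (P_{Java_2}, v)$.

We denote by $\rightarrow^{*}_{Java^\#}$ the extension of $\rightarrow_{Java^\#}$ to multiple rewrite steps.

Lemma 2. If $(P_{Java}, St_1^{E}) \rightarrow^{*}_{JavaE} (St_2^{E})$, then there exists
$SSt_3^{E} \in \wp(State^{E})$ s.t. $(P_{Java}, \alpha(\{St_1^{E}\})) \rightarrow^{*}_{Java^\#} (SSt_3^{E})$
and $St_2^{E} \in SSt_3^{E}$.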

Proof. (Sketch) Our abstraction consists of transforming equations into rules
and getting rid of the value component of states. Since the transformation of
a set of equations (which are confluent and terminating modulo axioms) into
rules preserves the execution traces, and (by the monotonicity, idempotency, and
extensivity of the upper closure operator $\alpha$) the removal of the value component
of states does not eliminate execution traces either, the conclusion follows. □

A program is non-interferent for a given labeling function if the abstract
values (the confidentiality labels) of the Low variables in the final state of an
abstract program execution do not have the label Low ≫ High.

Theorem 2 (Abstract Non-interference Soundness). Given a Java program
$P_{Java}$, $P_{Java}$ is non-interferent (Definition 2) if for all $SSt_1 \in \wp(State^{E})$
s.t. $(P_{Java}, SSt_1) \rightarrow^{*}_{Java^\#} (SSt_2)$, for all $St \in SSt_2$, and for all variables
$var \in Low(P_{Java})$, $St[var] = \langle Val, Low \rangle$ for a value Val.

Proof. By contradiction. Let us assume that $P_{Java}$ is not non-interferent. Then,
by Lemma 1, there exist $St_1^{E}$ with $(P_{Java}, St_1^{E}) \rightarrow^{*}_{JavaE} (St_2^{E})$ and
$var \in Low(P_{Java})$ s.t. $St_2^{E}[var] = \langle Val, L \rangle$ for a value Val and $L \neq Low$.
Since $(P_{Java}, St_1^{E}) \rightarrow^{*}_{JavaE} (St_2^{E})$, by Lemma 2 there exists
$SSt_3 \in \wp(State^{E})$ s.t. $(P_{Java}, \alpha(\{St_1^{E}\})) \rightarrow^{*}_{Java^\#} (SSt_3)$
and $St_2^{E} \in SSt_3$. This contradicts the assumption that for all $St \in SSt_3$,
and for all variables $var \in Low(P_{Java})$, $St[var] = \langle Val', Low \rangle$ for a value $Val'$. □

The following example illustrates the mechanization of the Java non-interference 
analysis. 

Example 8. Consider again the Java program of Example 1. By virtue of the
abstraction, we consider just one abstract initial state that safely approximates
any extended initial state and compute the corresponding abstract final states.

search in PGM-SEMANTICS-ABSTRACT :
java((preprocess(EX1-MAUDE) noType . 'main < new string [i(0)] > noVal))
=>! M:Store .

Solution 1 M:Store --> store([l(6), Low >> High] ...)
No more solutions.

Due to the transformation of some equations into rules in the abstract semantics, 
there may be several execution paths but all lead to the same abstract final state. 

6 Experiments 

Our methodology generates a safety certificate which essentially consists of the
set of (abstract) rewriting proofs that implicitly describe the program states
which can (and cannot) be reached from a given (abstract) initial state, as
illustrated in Example 8. Since these proofs correspond to the execution of the
abstract Java semantics specification, which is made available to the code
consumer, the certificate can be inexpensively checked on the consumer side by
any standard rewrite engine by means of a rewriting process that can be greatly
simplified. Actually, it suffices to check that each abstract rewriting step in the
certificate is valid and that no rewriting chain has been disregarded, which
essentially amounts to using the matching infrastructure available within the rewriting
engine. Note that, according to the different treatment of rules and equations
in Maude, where only transitions caused by rules create new states in the state
space, an extremely reduced certificate can be delivered by just recording the
rewrite steps given with the rules, while the rewritings using the equations are
omitted.

The abstract certification methodology described here has been implemented
in Maude. The prototype system offers a rewriting-based program certification
service, which is able to analyze global confidentiality program properties related
to non-interference. Our certification tool can generate three types of
certificates: (i) the full certificates consist of complete rewriting sequences including
all rewrite steps; (ii) the reduced rules certificates only contain the rewrite steps
that use rules; and (iii) the reduced labels certificates only record the labels of
the used rules.

Full Cert. size (Kb): 1134, 1251, ...
Red. Rules Cert. size (Kb): 6.1, ..., 21.1, ..., 21.3

Measure                            Prog. 1  Prog. 2  Prog. 3  Prog. 4  Prog. 5
Red. Labels Cert. size (Kb)            1.8      1.8      2.6      3.7      5.2
Full Cert. Gen. Time (ms)            10408    23574    29482    45709    84331
Red. Rules Cert. Gen. Time (ms)       7057     7030     7527     8215     9547
Red. Labels Cert. Gen. Time (ms)      7030     6700     7190     8198     9537

Table 1. Code measures, certificate sizes, and generation times

In Table 1, we analyze three key points for the practicality of our approach:
the size and complexity of the program code, the size of the three types of
certificates, and the certificate generation times. The running times are given
in milliseconds and were averaged over a sufficient number of iterations. We
considered three code measures: the code size in LOC (lines of source code), the
code size in bytes, and the cyclomatic complexity, which counts the execution
paths of a program. The experiments were performed on a laptop with a Pentium
M 1.40 GHz processor and 0.5 Gb RAM.

Program 1 consists mainly of a simple non-interferent code example
borrowed from [43,28]. The program has been structured into two classes. The first
class has one secret variable and one public variable, a constructor method,
two get methods, and a method that contains the non-interferent piece of code
of [43,28]. The second class is the main class with four method invocations.
Similarly, program 2 is a simple non-interferent example borrowed from [27].
It is structured into two classes. Program 3 includes three simple methods in
two classes: the non-interferent method included in program 1, an interferent
method borrowed from [43,28], and another non-interferent method borrowed
from [39]. The main method has a sequence of method invocations such that
the last invocation calls a non-interferent method, and thus the entire program
is non-interferent. Program 4 includes six simple methods, the three methods
included in program 3 and three other interferent methods also borrowed from
[43,28], including a method with a while loop and a method that calls another
method. In this case, the last invoked method as well as the whole example
program are non-interferent. Similarly, program 5 includes nine simple methods,
the six examples included in program 4 plus three other interferent methods:
two interferent variations of the loop example of program 5 and an interferent
method with a return statement within a conditional statement. The source code
of our benchmarks is provided within the distribution package.

The tool is provided with a Web interface written in Java and is publicly available
at http://www.dsic.upv.es/users/elp/toolsMaude/GlobalNI.html

The experiments are very encouraging since they show that the reduction in
size of the certificate is very significant in all cases, with the quotient "Red. Rules
Cert. Size/Full Cert. Size" ranging from 0.54% in program 2 to 0.09% in program
5. Note that the biggest reduction occurs for the largest program. When the times
employed to generate the full and reduced rules certificates are compared, the
ratio of the reduced certificate generation time to the full certificate generation
time ranges from 11.32% to 67.80%. The reduction for the biggest example (program 5)
was the largest one (11.32%). Note that the generation time for the reduced labels
certificate was not significantly lower than that of the reduced rules certificate. These
results show that the technique scales up better when reduced certificates are
considered.
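
As a quick check, the two extreme time ratios quoted above can be recomputed from the generation times reported in Table 1 for programs 1 and 5 (reduced rules certificate time divided by full certificate time):

```latex
\frac{7057}{10408} \approx 0.6780 = 67.80\%, \qquad
\frac{9547}{84331} \approx 0.1132 = 11.32\%
```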

7 Related Work 

Goguen and Meseguer [26] formalized non-interference of deterministic and
terminating systems as a system hyperproperty [13], i.e., a security property that
is defined for pairs of system output traces that are indistinguishable for an
observer. In [23], Focardi and Gorrieri defined a stronger, security-based notion of
non-interference that considers pairs of system input/output traces. In contrast
to [23], our safety-based notion of strong non-interference only considers secret
outputs, similarly to pS] . 

Barthe et al. [5] develop a methodology to prove non-interference of
deterministic terminating programs in an imperative language with loops,
conditionals, and mutable data structures (i.e. objects). Their methodology relies on
using Hoare logic and separation logic, and handles non-interference as a safety
property by using program self-composition with variable renaming (i.e., they
compose a program with a copy of itself without sharing memory positions).
Their method can verify non-interference of secure programs with temporary
breaches such as "low = high; low = 2", whereas imprecise conservative type
systems i42l46l4j cannot. Also, their method can deal with Examples H] and [71 
whereas we cannot ensure security for the last example. This proposal is complete
and sound, but the criterion is undecidable, and to the best of our knowledge
no approximation has yet been implemented.

Existing Java verification tools that use standard JML [52] as a property 
specification language do not support non-interference certification. Some so- 
phisticated non-interference policies can be expressed by using the JML ex- 
tensions of the Krakatoa Java verification tool [H] . These JML extensions were 
developed for Hoare-style assertions regarding program self-composition [B] . This 
means duplicating the code of the program and makes it necessary to distinguish
the same program variables in its two runs. These JML extensions are used to
express non-interference pre- and post-conditions, but they do not handle confi- 
dentiality labels of program variables explicitly; the method assumes that all the 
variables annotated with the extended JML assertions called "nil" and "ni2", 
are labeled Low. Nevertheless, the confidentiality aspect of non-interference is 
expressible using the JML specification pattern suggested in [28,43] as an
instrument for program verification using the theorem prover PVS. Unfortunately,
this proposal abuses notation by identifying the confidentiality levels with the
values of program variables, and it does not consider important Java features
such as method calls and interruptions (break, return or continue statements)
within conditional instructions and iterations. Moreover, a specification pattern
for confidentiality cannot be created in all cases, as mentioned in [43]. A
flow-sensitive and termination-insensitive analysis for object-oriented programs based
on Hoare logic is proposed in Amtoft et al. [3]. This analysis considers pointer
aliasing that can leak confidential information. The non-interference property 
is specified by using independence assertions that are written in JML. In order 
to compute postconditions, the analysis uses an algorithm that is sound and 
complete given some assumptions, but it does not generate a program security 
proof. 

Although non-interference has not been considered in current PCC
implementations, there are some proposals that are based on type systems for a subset
of Java [7], Java bytecode [37,9,8], and simple imperative languages [42,27,11].
None of these use JML to express non-interference policies and none of them have
yet been implemented. In [7], a type system is proposed as a basis for deriving
a certifying compiler for a subset of Java source code with objects, inheritance,
methods and simplified exceptions. JFlow [34] and Jif are security-typed
programming languages with support for enforcing information-flow and access
control with dynamic label policies, at both compile time and run time. These
compilers produce secure Java source code for verified programs. In order to
deal with program variables whose confidentiality labels are only known at run
time, dynamic labels are introduced. However, the dynamic labels of Jif have not
yet been proved to enforce secure information flow [47]. Volpano et al. [42]
developed an information-flow type system that can be used to check non-interference
of programs written in a generic deterministic sequential imperative language,
but this system cannot verify safe programs that have temporary breaches. In
[27], Hunt and Sands propose a flow-sensitive, dynamic type system that has
not yet been implemented. It tracks syntactical dependences between program
variables in a simple imperative language without objects or function calls.
Although we consider only two security levels, our methodology can easily be
extended to the multilevels of confidentiality of [27,8]. Moreover, we have shown
that our analysis can achieve more precision than traditional, type-based
approaches, thanks to the combination of static analysis and dynamic labeling. In
[5], Barthe et al. define the first information-flow type system for a sequential
JVM-like language with classes, objects, arrays, exceptions and method calls that
certifies non-interference in type-checked programs. The soundness was proved
by using the theorem prover Coq, and a certified lightweight bytecode verifier
for information flow was extracted from the proof.

Wasserrab et al. present in [33] the first machine-checked correctness proof 
for information-flow control that is based on program dependence graphs using 
static intraprocedural slicing. The proof is formalized in Isabelle/HOL. The 
analysis applies to deterministic terminating programs and is flow-sensitive, 
object-sensitive and context-sensitive. The machine-checked proof was 
instantiated for 
a simple imperative language with loops and for a subset of Jinja (a definition 
of Java bytecode), which must be manually annotated with security labels. This 
work does not consider method calls, classes, or objects. Bavera and Bonelli [TU] 
present a flow-sensitive type system for verifying non-interference of bytecode, 
where class fields may have different confidentiality labels for different instance 
objects. This methodology does not consider method calls and it does not 
generate checkable proofs. Moreover, as is usually the case in type-based analysis, 
once the object fields and the variable labels are determined, they remain fixed 
throughout the analysis. A proposal that deals with dynamic information-flow 
policies is [40]. This technique is based on runtime tracking of indirect 
dependencies between program points. While our confidentiality label tracking 
is also dynamic, our approach is based on static analysis rather than runtime 
monitoring, similarly to [27,28]. 

Some proposals also exist for non-interference verification that are based on 
abstract interpretation [5,46,25,24,45]. However, these proposals do not 
generate a certificate as an outcome of the verification process, and they do not use 
JML to express non-interference policies. The idea of first enriching the original 
semantics of the language by pairing each data value to its security level, and 
then approximating it by only considering the security level was also proposed 
in [5,46]. A similar idea is used in [23], where an abstract information-flow 
sensitive collecting semantics, which is called instruction-level security typing, for 
programs with dynamic structures is proposed; here input and output channels 
are given security levels, but the variables have no associated security levels. A 
different notion of abstract non-interference is proposed in [25] that approxi- 
mates the standard notion of non-interference by making it parametric relative 
to input/output abstractions. In abstract non-interference, the abstract domains 
encode the allowed flows that characterize the degree of precision of the 
knowledge of a potential attacker observing the data. By using classes and 
class hierarchies as abstract domains, Zanardini adopts a different perspective of abstract 
non-interference for classes in [35], where the abstract value of a concrete object 
is its class. Two objects (values) are indistinguishable at an abstraction level 
(class) if the objects belong to the given class or if the given class is a superclass 
of the objects' classes. An algorithm for checking abstract non-interference of Java 
classes is proposed that relies on class-based dependencies. 
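
The class-based indistinguishability relation described above can be sketched 
in a few lines of Java. This is only our reading of the informal definition, 
not the algorithm of [35]; the class name ClassAbstractionSketch and the 
method indistinguishableAt are hypothetical names introduced for illustration. 

```java
// Illustrative sketch only: names are hypothetical and not taken from [35].
class ClassAbstractionSketch {

    // Two objects are treated as indistinguishable at abstraction level
    // `level` when both are instances of `level`, i.e. each object's class
    // is `level` itself or one of its subclasses.
    static boolean indistinguishableAt(Class<?> level, Object a, Object b) {
        return level.isInstance(a) && level.isInstance(b);
    }

    public static void main(String[] args) {
        Object i = Integer.valueOf(1);
        Object d = Double.valueOf(2.0);
        // Number is a superclass of both Integer and Double.
        System.out.println(indistinguishableAt(Number.class, i, d));  // true
        // Double is not an Integer, so the finer level separates them.
        System.out.println(indistinguishableAt(Integer.class, i, d)); // false
    }
}
```

Under this reading, moving the abstraction level up the class hierarchy makes 
more pairs of objects indistinguishable, which corresponds to a weaker observer. 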

In previous work [2], we dealt with (local) non-interference of function 
methods regarding explicit inputs by parameter passing and explicit outputs by 
value returning. The local non-interference policies considered there were 
required to explicitly establish the confidentiality labels for all method 
parameters and variables. In this work, however, we consider global 
non-interference of complete 
Java classes and we do not need to explicitly state the confidentiality level for 
all program variables. In [2], we worked directly with an implementation-level 
definition of non-interference; in this work, we provide a general and 
language-independent characterization as well as a formal and rigorous relation between 
the approximate properties and the security model. As in [18,5,38,28,27], we take 
into account implicit information flows by considering the context confidentiality 
label in expression evaluation (the context label is joined with the confidential- 
ity label of the expression) and also by modifying the context label during the 
evaluation of guards of conditionals and while loops. Our global policies are very 
flexible since the security levels of object variables, local variables, and method 
parameters may change temporarily as in [27,28,5,24,6]. 
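
The treatment of implicit flows described above can be illustrated with a 
minimal Java sketch. It assumes a two-level lattice (LOW, HIGH) and a map from 
variable names to labels. The class ContextLabelSketch and its methods are 
hypothetical names chosen for this example and are not part of our Maude 
specification. 

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: all names here are hypothetical.
class ContextLabelSketch {

    // Two-level confidentiality lattice with its join (least upper bound).
    enum Level {
        LOW, HIGH;

        Level join(Level other) {
            return (this == HIGH || other == HIGH) ? HIGH : LOW;
        }
    }

    // Abstract store: variable name -> current confidentiality label.
    private final Map<String, Level> labels = new HashMap<>();

    void declare(String var, Level level) {
        labels.put(var, level);
    }

    Level labelOf(String var) {
        return labels.get(var);
    }

    // Evaluating a guard raises the context label to the join of the
    // current context and the label of the guard expression.
    Level enterGuard(Level context, Level guardLabel) {
        return context.join(guardLabel);
    }

    // An assignment stores the join of the context label and the label
    // of the assigned expression, so labels may change during analysis.
    void assign(String target, Level exprLabel, Level context) {
        labels.put(target, context.join(exprLabel));
    }

    public static void main(String[] args) {
        ContextLabelSketch s = new ContextLabelSketch();
        s.declare("secret", Level.HIGH);
        s.declare("out", Level.LOW);

        // Abstractly analyse: if (secret) { out = 1; }
        Level ctx = s.enterGuard(Level.LOW, s.labelOf("secret"));
        s.assign("out", Level.LOW, ctx); // the constant 1 is LOW

        System.out.println(s.labelOf("out")); // prints HIGH
    }
}
```

Although the constant assigned to out is LOW, the branch is taken under a HIGH 
guard, so out is relabelled HIGH; this is the implicit flow that the context 
label is meant to capture. 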

8 Conclusion 

In this paper, we formalize a framework for automatically certifying global 
non-interference of Java programs. Our methodology relies on an (abstract) extended 
semantics for Java written in rewriting logic that can be model-checked in Maude 
by using Maude's breadth-first search space exploration. In the extended seman- 
tics, non-interference becomes a safety property, and we formally demonstrate 
that the safety property in the extended semantics entails the semantic, non- 
interference security property in the standard Java semantics. In this work, we 
provide a general and abstract definition as well as a rigorous link between 
the approximate properties and the security model that we consider, whereas 
in our previous work [2], we worked directly with a program-level definition of 
non-interference. The proposed framework fully accounts for explicit as well as 
implicit flows, and allows not only the inference of rewriting logic safety proofs 
but also the checking of existing ones, thus providing support for proof-carrying 
code. In fact, the steps that the abstract semantics takes are recorded in order 
to construct a certificate ensuring that the program satisfies the desired prop- 
erty. By turning a potentially infinite labelled state space of a Java program 
into a finite abstract space, the abstract semantics not only makes the approach 
feasible, but also greatly reduces the size of the certificates that must be checked 
on the consumer's end. 
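
To illustrate why a finite abstract state space makes the safety check 
feasible, the following Java sketch performs a breadth-first exploration of a 
finite transition system and reports whether a bad state is reachable. It is a 
generic stand-in for the idea, not our Maude implementation; the class 
SafetySearchSketch, the method isSafe, and the toy transition relation are all 
hypothetical. 

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

// Illustrative sketch only: all names here are hypothetical.
class SafetySearchSketch {

    // Returns true when no state satisfying `bad` is reachable from `init`.
    // Termination relies on the reachable state space being finite.
    static <S> boolean isSafe(S init, Function<S, List<S>> next, Predicate<S> bad) {
        Set<S> seen = new HashSet<>();
        Deque<S> queue = new ArrayDeque<>();
        seen.add(init);
        queue.add(init);
        while (!queue.isEmpty()) {
            S state = queue.remove();
            if (bad.test(state)) {
                return false; // a violating state is reachable
            }
            for (S succ : next.apply(state)) {
                if (seen.add(succ)) { // add returns false for visited states
                    queue.add(succ);
                }
            }
        }
        return true; // every reachable state was visited and none is bad
    }

    public static void main(String[] args) {
        // Toy system: a counter that advances by 2 modulo 8, starting at 0.
        Function<Integer, List<Integer>> step =
                n -> Collections.singletonList((n + 2) % 8);
        System.out.println(isSafe(0, step, n -> n == 3)); // true
        System.out.println(isSafe(0, step, n -> n == 6)); // false
    }
}
```

The set of visited states plays the role that the finite abstract space plays 
in our setting: without it, or with an infinite state space, the exploration 
need not terminate. 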

The Java operational semantics in rewriting logic that we have used is modu- 
lar and has 2635 lines of code in 4 files [21] . We have modified less than 20 of the 
1527 lines of code in the main file of the original Java semantics. The abstract 
operational Java semantics was developed as a source-to-source transformation 
in rewriting logic and consists of 650 lines of extra code. This is equivalent to 
saying that, in our current system, the trusted computing base (TCB)^ is less 
than a fourth of the size of the original Java semantics (at least one order of 
magnitude smaller than the standard rewriting infrastructure, and even much 
smaller than other PCC systems). 
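
As a quick check of the "less than a fourth" figure, and assuming that the 
TCB is counted as the 650 lines of extra code and compared against the 2635 
lines of the original semantics (our reading of the numbers above): 

```latex
\[
\frac{650}{2635} \approx 0.247 < \frac{1}{4}
\]
```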

^ The TCB is the part of the code that is used to check if other code can be safely 
run, and it is assumed to be trusted. 
Since our approach is based on a rewriting logic semantics specification of 
the full Java 1.4 language [33], the methodology developed in this work can be 
easily extended to cope with exceptions, heaps, and multithreading since they 
are considered in the Java rewriting logic semantics. 